Identification of new bacterial type III secreted effectors with a recursive Hidden Markov Model profile-alignment strategy
نویسندگان
چکیده
To identify new bacterial type III secreted effectors is computationally a big challenge. At least a dozen machine learning algorithms have been developed, but so far have only achieved limited success. Sequence similarity appears important for biologists but is frequently neglected by algorithm developers for effector prediction, although large success was achieved in the field with this strategy a decade ago. In this study, we propose a recursive sequence alignment strategy with Hidden Markov Models, to comprehensively find homologs of known YopJ/P full-length proteins, effector domains and N-terminal signal sequences. Using this method, we identified 155 different YopJ/P-family effectors and 59 proteins with YopJ/P N-terminal signal sequences from 27 genera and more than 70 species. Among these genera, we also identified one type III secretion system (T3SS) from Uliginosibacterium and two T3SSs from Rhizobacter for the first time. Higher conservation of effector domains, N-terminal fusion of signal sequences to other effectors, and the exchange of N-terminal signal sequences between different effector proteins were frequently observed for YopJ/P-family proteins. This made it feasible to identify new effectors based on separate similarity screening for the N-terminal signal peptides and the effector domains of known effectors. This method can also be applied to search for homologues of other known T3SS effectors.
منابع مشابه
A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences
The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...
متن کاملUsing Weakly Conserved Motifs Hidden in Secretion Signals to Identify Type-III Effectors from Bacterial Pathogen Genomes
BACKGROUND As one of the most important virulence factor types in gram-negative pathogenic bacteria, type-III effectors (TTEs) play a crucial role in pathogen-host interactions by directly influencing immune signaling pathways within host cells. Based on the hypothesis that type-III secretion signals may be comprised of some weakly conserved sequence motifs, here we used profile-based amino aci...
متن کاملEffective Identification of Bacterial Type III Secretion Signals Using Joint Element Features
Type III secretion system (T3SS) plays important roles in bacteria and host cell interactions by specifically translocating type III effectors into the cytoplasm of the host cells. The N-terminal amino acid sequences of the bacterial type III effectors determine their specific secretion via type III secretion conduits. It is still unclear as to how the N-terminal sequences guide this specificit...
متن کاملIdentification of Novel Type III Effectors Using Latent Dirichlet Allocation
Among the six secretion systems identified in Gram-negative bacteria, the type III secretion system (T3SS) plays important roles in the disease development of pathogens. T3SS has attracted a great deal of research interests. However, the secretion mechanism has not been fully understood yet. Especially, the identification of effectors (secreted proteins) is an important and challenging task. Th...
متن کاملاستفاده از مدل مارکوف پنهان در پیشبینی موارد جدید سل در استان همدان بر اساس اطلاعات موارد ثبت شده طی سالهای 94-1384
Background and Objectives: Tuberculosis is a chronic bacterial disease and a major cause of morbidity and mortality. It is caused by a Mycobacterium tuberculosis. Awareness of the incidence and number of new cases of the disease is valuable information for revising the implemented programs and development indicators. time series and regression are commonly used models for prediction but these m...
متن کامل